Skip to content

Harden Photos sync output path handling#220

Open
MrJarnould wants to merge 14 commits intotimlaing:mainfrom
MrJarnould:codex/photos-safe-sync-paths
Open

Harden Photos sync output path handling#220
MrJarnould wants to merge 14 commits intotimlaing:mainfrom
MrJarnould:codex/photos-safe-sync-paths

Conversation

@MrJarnould
Copy link
Copy Markdown

Breaking change

None.

Proposed change

This PR hardens Photos CloudKit sync path handling so remote photo metadata and persisted sync state cannot cause local filesystem writes or deletes outside the configured sync directory.

The sync engine now sanitizes remote resource filenames, normalizes folder-structure output into safe relative path components, and resolves final target paths through a containment check before writing downloaded files, applying local EXIF/XMP metadata updates, or deleting stale local files during --auto-delete.

This addresses a SonarCloud path-injection finding around EXIF metadata writes by fixing the path construction boundary rather than only changing the highlighted write call.

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New service (thank you!)
  • New feature (which adds functionality to an existing service)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests
  • Documentation or code sample

Example of code:

Additional information

  • This PR fixes or closes issue: fixes #
  • This PR is related to issue: SonarCloud pythonsecurity:S2083 path-injection finding in Photos CloudKit materialization

Tested locally with:

ruff format pyicloud/services/photos_cloudkit/sync.py tests/services/test_photos_sync.py
ruff check pyicloud/services/photos_cloudkit/sync.py tests/services/test_photos_sync.py
UV_CACHE_DIR=/tmp/uv-cache UV_NO_EDITABLE=1 uv run --no-sync pytest tests/services/test_photos_sync.py -q
UV_CACHE_DIR=/tmp/uv-cache UV_NO_EDITABLE=1 uv run --no-sync pytest tests/services/test_photos.py tests/services/test_photos_sync.py tests/test_cmdline.py -q

Results:

18 passed
235 passed

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • Tests have been added to verify that the new code works.

If user exposed functionality or configuration variables are added/changed:

  • Documentation added/updated to README

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 21, 2026

Warning

Rate limit exceeded

@MrJarnould has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 38 minutes and 51 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 38 minutes and 51 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e5b480eb-4af1-4a58-af77-9f32f935e52d

📥 Commits

Reviewing files that changed from the base of the PR and between f424a10 and 9759afa.

📒 Files selected for processing (6)
  • pyicloud/common/cloudkit/client.py
  • pyicloud/services/photos_cloudkit/materialize.py
  • pyicloud/services/photos_legacy.py
  • tests/services/test_photos.py
  • tests/services/test_photos_cloudkit_client.py
  • tests/services/test_photos_sync.py
📝 Walkthrough

Walkthrough

Replaces the monolithic Photos implementation with a modular CloudKit-backed and legacy photos service, adds a typed CloudKit client, a full photo sync engine with state persistence and materialisation, expands CLI photo commands (libraries/get/changes/sync/sync-cursor/watch), and adds extensive tests and fixtures.

Changes

Cohort / File(s) Summary
Documentation & Examples
README.md, examples.py
CLI/install examples unprompted, expanded “Photos CLI” docs, migration guidance, updated upload examples and example code to use api.photos.upload(...) and timestamped album names.
CLI Photos Commands
pyicloud/cli/commands/photos.py
New Typer commands (libraries, get, changes, sync-cursor, sync, watch); list/download accept --library; centralized init helpers and scoped album/library error messages; JSON/text output and watch-mode streaming.
CLI Normalisation
pyicloud/cli/normalize.py
Added normalization helpers for libraries, detailed photo metadata, change events, per-item sync actions and aggregated sync results for CLI JSON output.
CloudKit Core
pyicloud/common/cloudkit/*
pyicloud/common/cloudkit/client.py, .../models.py, __init__.py
New typed CloudKit transport and CloudKitContainerClient, response models for zones/database changes, and specific HTTP error/429/auth exception types.
Photos CloudKit Service
pyicloud/services/photos_cloudkit/...
Complete CloudKit-backed photos subsystem: public API (__init__.py), client adapter, constants/enums, query builders, record mappers, models, materialisation (XMP/EXIF/RAW alignment), state backend (SQLite/in-memory), sync engine (run/watch), and service/library/album/asset classes.
Photos Legacy Service
pyicloud/services/photos_legacy.py
Legacy, browser-style photos implementation retained for shared-stream/back-compat use; re-exports and separated from CloudKit flow.
Photos Facade
pyicloud/services/photos.py
Converted to a facade re-exporting types from photos_cloudkit and photos_legacy, removed prior in-file implementations.
Tests & Fixtures
tests/fixtures/..., tests/services/*, tests/test_cmdline.py
Large set of new JSON fixtures for CloudKit interactions and many new/expanded tests: CloudKit client unit tests, sync engine integration tests (state, auto-delete, materialisation, path-safety), and extended CLI tests for new commands and library-scoped flows.
Examples & Test Helpers
examples.py, tests/test_cmdline.py
Example code updated to use timestamped album names and new upload API; test fixtures and fake service objects expanded to model multiple libraries and sync behaviours.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant PhotosService
    participant CloudKitClient
    participant PhotoSyncState
    participant Filesystem

    User->>CLI: photos sync --library root
    CLI->>PhotosService: sync(options)
    PhotosService->>PhotoSyncState: open state DB / get cursor
    PhotosService->>CloudKitClient: database_changes(since=cursor)
    CloudKitClient-->>PhotosService: zone/asset records
    loop for each asset
        PhotosService->>PhotoSyncState: check resource manifest
        alt resource not current
            PhotosService->>CloudKitClient: download_asset_bytes(url)
            CloudKitClient-->>PhotosService: bytes
            PhotosService->>Filesystem: write file (atomic)
            PhotosService->>PhotosService: apply EXIF/XMP/RAW policies
            PhotosService->>PhotoSyncState: upsert resource
        else
            PhotosService-->>PhotoSyncState: record skip
        end
    end
    PhotosService->>PhotoSyncState: set_sync_cursor(new_cursor)
    PhotosService-->>CLI: return sync result
    CLI-->>User: display summary / JSON
Loading
sequenceDiagram
    participant User
    participant CLI as CLI<br/>photos watch
    participant Service as PhotosService
    participant Sync as run_photo_sync
    participant Output as UI

    User->>CLI: photos watch --iterations 3 --interval 30
    CLI->>Service: watch(options, interval_seconds=30, iterations=3)
    loop i from 1 to 3
        Service->>Sync: run_photo_sync(options)
        Sync-->>Service: PhotoSyncResult
        Service->>Output: print iteration status and results
        alt not last iteration
            Service->>Service: sleep(30)
        end
    end
    Service-->>CLI: finished
    CLI-->>User: exit
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • PR #199 — Extends CloudKit infrastructure and typed client usages; closely related to CloudKit client/models additions here.
  • PR #203 — Modifies CLI photos commands and related packaging; strongly related to the CLI refactor in this change.
  • PR #55 — Changes album/folder fetching/pagination in the photos service; likely to conflict with the new photos service refactor.

Suggested reviewers

  • timlaing

Poem

🐰 I hopped through folders, byte by byte,

CloudKit hummed and synced all night,
Cursors kept a steady beat,
Albums named with timestamps neat,
Watch and sync — what a fluffy sight! 📸✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 47.91% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The pull request title 'Harden Photos sync output path handling' clearly and specifically describes the main change: improving path safety in the Photos sync feature.
Description check ✅ Passed The pull request description is comprehensive and directly related to the changeset, explaining the hardening of path handling, security implications, and testing performed.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

@coderabbitai coderabbitai Bot added the enhancement New feature or request label Apr 21, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 15

🧹 Nitpick comments (1)
tests/fixtures/photos_album_create_response.json (1)

1-16: LGTM! Valid CloudKit album creation response fixture.

The JSON structure is well-formed and correctly represents a CloudKit album creation response. The base64-encoded album name and response format are appropriate for testing the album creation flow.

Given the PR's focus on path-safety hardening, you may want to ensure that your test suite also includes fixtures with edge cases such as album names containing path traversal sequences (../), special filesystem characters, or unusually long names—though these may already be covered in other test files.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/fixtures/photos_album_create_response.json` around lines 1 - 16, Add
additional test fixtures alongside photos_album_create_response.json covering
edge-case album names: create variants where the fields.albumNameEnc value is
base64 for path traversal sequences ("../"), names with special filesystem
characters (e.g., ":" "\" "*" "?" "<" ">" "|"), and an unusually long name
(e.g., >255 chars) to validate path-safety hardening. Ensure each fixture keeps
the same CloudKit structure (records[*].recordName, recordChangeTag,
fields.albumNameEnc, fields.isDeleted) so tests can reuse parsing logic that
reads albumNameEnc and confirm sanitization or rejection behavior in the code
that handles album creation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples.py`:
- Around line 451-453: The code constructing album_name uses deprecated
datetime.utcnow(); update the expression to use a timezone-aware call by
replacing datetime.utcnow().strftime(...) with
datetime.now(timezone.utc).strftime(...) (ensure timezone is imported) so
album_name generation (the f-string around datetime/... and uuid4().hex[:8])
remains unchanged but uses datetime.now(timezone.utc) instead of the deprecated
utcnow.

In `@pyicloud/cli/commands/photos.py`:
- Around line 944-976: The CLI prints the "Starting photo watch" message before
the iterator's internal sleep because the sleep happens when
next(watch_payloads) resumes; to fix, either (A) move the wait/sleep
orchestration out of the iterator and into this loop so you call
_print_photo_watch_wait() then sleep and only then call next(watch_payloads) and
_print_photo_watch_start(), or (B) change the iterator that produces
watch_payloads (watch_photo_sync) to yield a payload immediately and perform its
sleep after yielding (i.e., yield before sleeping) so the CLI's
_print_photo_watch_start() reflects actual start timing; update references in
this file to use the chosen approach and ensure _print_photo_watch_wait,
_print_photo_watch_start, and next(watch_payloads) are called in the corrected
order.

In `@pyicloud/common/cloudkit/client.py`:
- Around line 131-137: The asset GET error path currently maps 429 responses to
a generic CloudKitApiError; change it to detect HTTP 429 and raise
CloudKitRateLimited (same as post()) so callers can back off, extracting the
Retry-After header from resp.headers (e.g., retry_after =
resp.headers.get("Retry-After")) and include it on the CloudKitRateLimited
exception while leaving 401/403 to raise CloudKitAuthError and other >=400
responses to raise CloudKitApiError (preserve existing payload extraction via
getattr(resp, "text", None)); update the logic near the code that currently
raises CloudKitAuthError/CloudKitApiError for asset GETs to implement this 429
handling.

In `@pyicloud/services/photos_cloudkit/client.py`:
- Around line 141-147: Wrap the model validation calls in batch_count and
upload_file so validation errors are converted to CloudKitApiError: in
batch_count, move PhotosBatchCountResponse.model_validate(raw_data) into the try
block that currently catches the extraction error, catch Exception as exc, and
raise CloudKitApiError("Photos count query failed", payload=raw_data) from exc
(don't reference data since it may not exist); likewise in upload_file wrap the
relevant Response model_validate call in a try/except and raise CloudKitApiError
with payload=raw_data when validation fails, preserving the existing error
message patterns and using CloudKitApiError as the public boundary.

In `@pyicloud/services/photos_cloudkit/mappers.py`:
- Around line 28-40: The current branch encodes value with value.encode("ascii")
outside the try block causing UnicodeEncodeError for already-decoded non-ASCII
strings; change the str branch to set raw = value (keep bytes branch as-is) and
move the ASCII encoding into the guarded block so the code attempts
base64.b64decode(raw.encode("ascii")).decode("utf-8") inside the try, on failure
return raw (the original string) in the inner fallback, and only log via
LOGGER.debug(record_name(record), field_name) and return None as the final
fallback; reference symbols: value, raw, base64.b64decode, LOGGER.debug,
record_name, field_name, record.

In `@pyicloud/services/photos_cloudkit/materialize.py`:
- Around line 411-432: _jpeg_has_exif_datetime currently forces little-endian
when reading the first IFD offset which breaks big-endian (MM) EXIF; change the
code to detect the TIFF byte order from the EXIF payload (check
exif_payload[0:2] == b"MM" for big-endian vs b"II" for little-endian) and pass
the correct byte order to _read_uint32 (instead of the hardcoded b"<"), so the
initial call that computes the first IFD offset in _jpeg_has_exif_datetime uses
the detected byte order before calling _parse_tiff_ifd and subsequent tag
readers.

In `@pyicloud/services/photos_cloudkit/service.py`:
- Around line 1283-1286: The fullname method may KeyError or recurse forever for
missing or cyclic parents; update Album.fullname to defensively resolve parent
chain: iterate up using self._parent_id and lookups on self._library.albums (use
a safe lookup that does not raise, e.g., AlbumContainer.get or try/except around
AlbumContainer.__getitem__), build path segments iteratively while tracking
visited parent ids to detect cycles, and stop/return a sensible fallback (e.g.,
the current name or a partial path) when a parent is missing or a cycle is
detected instead of recursing. Ensure you reference and update the fullname
method on Album and use the _parent_id and self._library.albums symbols for
locating the lookup and cycle detection.
- Around line 1477-1515: The legacy branch in _get_len blindly indexes
response["batch"][0]["records"][0]["fields"]["itemCount"]["value"] and can raise
IndexError/KeyError; update _get_len to defensively parse the JSON from the
POST: verify response is a dict, that "batch" exists and is non-empty, that
batch[0] contains "records" with at least one entry, and that that record has
"fields"→"itemCount"→"value"; if any of those checks fail or an exception
(IndexError/KeyError/TypeError/ValueError) occurs, fall back to a safe default
(e.g., return 0) or—if self._client is available—call the typed path
self._client.batch_count(container_id=self._get_container_id,
zone_id=self._zone_id); ensure the function returns an int and does not let raw
indexing errors propagate.

In `@pyicloud/services/photos_cloudkit/sync.py`:
- Around line 203-215: The current logic allows assets to be marked
asset_ready_for_delete while --until-found is enabled, which can skip resources
and prematurely delete remote assets; modify the delete-determination code (the
block that sets/uses asset_ready_for_delete in sync.py, including the loop/logic
referenced around the later 308-385 range) so that when options.until_found is
not None you never mark an asset as ready for remote deletion unless all of its
resources have been explicitly confirmed local (or simply disable
auto-delete/keep_icloud_recent_days behavior when options.until_found is set).
Specifically, guard the branch that sets asset_ready_for_delete with a check
against options.until_found (or add a flag that requires full-resource
confirmation) so remote deletion paths are skipped if until_found could have
skipped resources.
- Around line 217-229: The code currently always creates the output directory
via options.directory.mkdir(...) even in preview modes; change it to only create
the directory for real sync runs by wrapping the mkdir call in a condition that
checks not (options.dry_run or options.only_print_filenames). Keep the rest
intact (PhotoSyncResult construction, create_photo_sync_state call and its
ephemeral logic) so only the directory creation is gated while other behavior
remains the same.
- Around line 511-517: The loop over state.iter_resources() short-circuits based
only on path existence; update it to also compare the tracked/stored size from
the resource entry against the actual filesystem size before deciding the
resource is current. In the block that uses _safe_target_path(directory,
entry.relative_path) and checks path.exists(), fetch the expected size from the
entry (e.g., entry.size or the resource size field on entry) and compare it to
path.stat().st_size (or equivalent); if sizes differ, return False so
truncated/replaced files don't incorrectly pass. Ensure PhotosServiceException
handling remains the same.

In `@pyicloud/services/photos_legacy.py`:
- Around line 724-733: The libraries property makes a POST to
f"{self.service_endpoint}/changes/database" but omits self.params; update the
session.post call inside BasePhotosService.libraries (the block referencing
self.service_endpoint, session.post, data="{}", headers={CONTENT_TYPE:
CONTENT_TYPE_TEXT}) to pass the CloudKit request params by adding
params=self.params so the request includes required account/request flags for
proper library discovery.

In `@tests/fixtures/photos_browser_mutations/README.md`:
- Around line 8-10: In the README sentence "Stable placeholder values are used
instead so request and response relationships remain testable without exposing
personal data.", add a comma after "instead" to read "Stable placeholder values
are used instead, so request and response relationships remain testable without
exposing personal data." Edit the sentence in the README entry that contains the
phrase "Stable placeholder values are used instead so..." to insert the comma
for improved readability.

In `@tests/services/test_photos_sync.py`:
- Around line 784-795: The test currently passes added_date=None which
DummyAsset coerces back to asset_date; instead, after constructing recent_asset
and old_asset (created via DummyAsset), explicitly delete their added_date
attributes (e.g., del recent_asset.added_date and del old_asset.added_date) so
the code under test must fall back to asset_date; update references to
recent_asset and old_asset accordingly to ensure the filter exercises the
fallback path.
- Around line 680-704: The test currently relies on datetime.now() via
DummyAsset which makes the "2026:" assertion flaky; fix by creating a fixed
asset_date (e.g., datetime(2024, 6, 1, ...)) and pass it into DummyAsset so the
generated EXIF is deterministic, then update the assertions to expect the chosen
year string (e.g., "2024:") instead of "2026:". Locate the test where
DummyAsset("asset-exif", "photo.jpg", payloads=...) is created and modify it to
accept the fixed asset_date, keeping run_photo_sync and PhotoSyncOptions usage
unchanged.

---

Nitpick comments:
In `@tests/fixtures/photos_album_create_response.json`:
- Around line 1-16: Add additional test fixtures alongside
photos_album_create_response.json covering edge-case album names: create
variants where the fields.albumNameEnc value is base64 for path traversal
sequences ("../"), names with special filesystem characters (e.g., ":" "\" "*"
"?" "<" ">" "|"), and an unusually long name (e.g., >255 chars) to validate
path-safety hardening. Ensure each fixture keeps the same CloudKit structure
(records[*].recordName, recordChangeTag, fields.albumNameEnc, fields.isDeleted)
so tests can reuse parsing logic that reads albumNameEnc and confirm
sanitization or rejection behavior in the code that handles album creation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: be4b8642-c2ff-4018-8f3d-116c45b7f62b

📥 Commits

Reviewing files that changed from the base of the PR and between 7abec01 and 708710c.

📒 Files selected for processing (66)
  • README.md
  • examples.py
  • pyicloud/cli/commands/photos.py
  • pyicloud/cli/normalize.py
  • pyicloud/common/cloudkit/__init__.py
  • pyicloud/common/cloudkit/client.py
  • pyicloud/common/cloudkit/models.py
  • pyicloud/services/photos.py
  • pyicloud/services/photos_cloudkit/__init__.py
  • pyicloud/services/photos_cloudkit/client.py
  • pyicloud/services/photos_cloudkit/constants.py
  • pyicloud/services/photos_cloudkit/mappers.py
  • pyicloud/services/photos_cloudkit/materialize.py
  • pyicloud/services/photos_cloudkit/models.py
  • pyicloud/services/photos_cloudkit/queries.py
  • pyicloud/services/photos_cloudkit/service.py
  • pyicloud/services/photos_cloudkit/state.py
  • pyicloud/services/photos_cloudkit/sync.py
  • pyicloud/services/photos_legacy.py
  • tests/fixtures/README.md
  • tests/fixtures/photos_album_create_response.json
  • tests/fixtures/photos_album_membership_query_core.json
  • tests/fixtures/photos_album_membership_response.json
  • tests/fixtures/photos_album_rename_response.json
  • tests/fixtures/photos_all_photos_query_core.json
  • tests/fixtures/photos_all_photos_response.json
  • tests/fixtures/photos_browser_mutations/README.md
  • tests/fixtures/photos_browser_mutations/album_add_photo_request.json
  • tests/fixtures/photos_browser_mutations/album_add_photo_response.json
  • tests/fixtures/photos_browser_mutations/album_create_request.json
  • tests/fixtures/photos_browser_mutations/album_create_response.json
  • tests/fixtures/photos_browser_mutations/album_delete_request.json
  • tests/fixtures/photos_browser_mutations/album_delete_response.json
  • tests/fixtures/photos_browser_mutations/album_remove_photo_request.json
  • tests/fixtures/photos_browser_mutations/album_remove_photo_response.json
  • tests/fixtures/photos_browser_mutations/album_rename_request.json
  • tests/fixtures/photos_browser_mutations/album_rename_response.json
  • tests/fixtures/photos_browser_mutations/photo_delete_request.json
  • tests/fixtures/photos_browser_mutations/photo_delete_response.json
  • tests/fixtures/photos_database_changes_response.json
  • tests/fixtures/photos_favorites_query_core.json
  • tests/fixtures/photos_favorites_response.json
  • tests/fixtures/photos_indexing_not_finished_response.json
  • tests/fixtures/photos_live_photo_response.json
  • tests/fixtures/photos_missing_counterparts_response.json
  • tests/fixtures/photos_recently_added_query_core.json
  • tests/fixtures/photos_recently_added_response.json
  • tests/fixtures/photos_shared_library_all_photos_query_core.json
  • tests/fixtures/photos_shared_library_all_photos_response.json
  • tests/fixtures/photos_shared_library_favorites_query_core.json
  • tests/fixtures/photos_shared_library_favorites_response.json
  • tests/fixtures/photos_shared_library_private_zones_response.json
  • tests/fixtures/photos_shared_library_shared_zones_response.json
  • tests/fixtures/photos_shared_library_unfavorite_request.json
  • tests/fixtures/photos_shared_library_unfavorite_response.json
  • tests/fixtures/photos_shared_library_zone_changes_request.json
  • tests/fixtures/photos_shared_library_zone_changes_response.json
  • tests/fixtures/photos_upload_duplicate_response.json
  • tests/fixtures/photos_upload_skeletal_response.json
  • tests/fixtures/photos_video_only_response.json
  • tests/fixtures/photos_zone_changes_response.json
  • tests/fixtures/photos_zones_list_response.json
  • tests/services/test_photos.py
  • tests/services/test_photos_cloudkit_client.py
  • tests/services/test_photos_sync.py
  • tests/test_cmdline.py

Comment thread examples.py Outdated
Comment thread pyicloud/cli/commands/photos.py Outdated
Comment thread pyicloud/common/cloudkit/client.py
Comment on lines +141 to +147
data = PhotosBatchCountResponse.model_validate(raw_data)
try:
return data.batch[0].records[0].fields.itemCount.value
except Exception as exc:
raise CloudKitApiError(
"Photos count query failed", payload=data.model_dump(mode="json")
) from exc
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Inspect response model validation sites and malformed-payload test coverage.
rg -nP -C3 'PhotosBatchCountResponse\.model_validate|PhotosUploadResponse\.model_validate|malformed|invalid JSON|CloudKitApiError'

Repository: timlaing/pyicloud

Length of output: 23106


🏁 Script executed:

cat -n pyicloud/services/photos_cloudkit/client.py | sed -n '135,150p'

Repository: timlaing/pyicloud

Length of output: 797


🏁 Script executed:

cat -n pyicloud/services/photos_cloudkit/client.py | sed -n '180,195p'

Repository: timlaing/pyicloud

Length of output: 619


🏁 Script executed:

# Check if there are tests specifically for validation errors (not extraction errors)
rg -n 'def test.*upload.*malformed|def test.*validation|ValidationError' tests/services/test_photos_cloudkit_client.py

Repository: timlaing/pyicloud

Length of output: 43


🏁 Script executed:

# Check what test_upload_file_raises_cloudkit_error_for_invalid_json actually tests
rg -A 15 'def test_upload_file_raises_cloudkit_error_for_invalid_json' tests/services/test_photos_cloudkit_client.py

Repository: timlaing/pyicloud

Length of output: 711


🏁 Script executed:

# Check the full _response_json implementation
cat -n pyicloud/services/photos_cloudkit/client.py | sed -n '149,169p'

Repository: timlaing/pyicloud

Length of output: 937


🏁 Script executed:

# Look for tests that might test schema validation failures (not JSON parsing failures)
rg -B 2 -A 10 'test.*batch_count.*malformed' tests/services/test_photos_cloudkit_client.py

Repository: timlaing/pyicloud

Length of output: 481


🏁 Script executed:

# Check if PhotosBatchCountResponse and PhotosUploadResponse are Pydantic models
rg -n 'class PhotosBatchCountResponse|class PhotosUploadResponse' pyicloud/services/photos_cloudkit/

Repository: timlaing/pyicloud

Length of output: 327


🏁 Script executed:

# Verify that the exception would indeed be ValidationError by checking imports
rg -n 'from pydantic|import.*ValidationError' pyicloud/services/photos_cloudkit/client.py

Repository: timlaing/pyicloud

Length of output: 43


🏁 Script executed:

# Check imports at top of photos_cloudkit/client.py
head -30 pyicloud/services/photos_cloudkit/client.py

Repository: timlaing/pyicloud

Length of output: 786


🏁 Script executed:

# Check the models to confirm they are Pydantic models
rg -B 2 'class PhotosBatchCountResponse' pyicloud/services/photos_cloudkit/models.py | head -10

Repository: timlaing/pyicloud

Length of output: 154


🏁 Script executed:

# Look at the broader pattern in CloudKitContainerClient to see how validation errors are typically caught
rg -B 5 'ValidationError' pyicloud/common/cloudkit/client.py | head -40

Repository: timlaing/pyicloud

Length of output: 1505


Normalise model-validation failures too.

Malformed successful JSON can still escape as a validation exception here, because only HTTP/JSON parsing and the nested count extraction are wrapped. Keep the public boundary as CloudKitApiError for bad Photos payloads as well.

This follows the pattern established throughout CloudKitContainerClient, where validation errors are caught and normalised (see query, lookup, zones_list, modify, and changes methods).

The two locations requiring fixes are:

  • Lines 141–147 (batch_count): Wrap model_validate in the try-except block and use raw_data in the error payload (since data won't exist if validation fails).
  • Lines 184–186 (upload_file): Wrap model_validate in a try-except block to catch and normalise validation errors.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyicloud/services/photos_cloudkit/client.py` around lines 141 - 147, Wrap the
model validation calls in batch_count and upload_file so validation errors are
converted to CloudKitApiError: in batch_count, move
PhotosBatchCountResponse.model_validate(raw_data) into the try block that
currently catches the extraction error, catch Exception as exc, and raise
CloudKitApiError("Photos count query failed", payload=raw_data) from exc (don't
reference data since it may not exist); likewise in upload_file wrap the
relevant Response model_validate call in a try/except and raise CloudKitApiError
with payload=raw_data when validation fails, preserving the existing error
message patterns and using CloudKitApiError as the public boundary.

Comment thread pyicloud/services/photos_cloudkit/mappers.py
Comment thread pyicloud/services/photos_cloudkit/sync.py
Comment thread pyicloud/services/photos_legacy.py
Comment thread tests/fixtures/photos_browser_mutations/README.md
Comment thread tests/services/test_photos_sync.py
Comment on lines +784 to +795
recent_asset = DummyAsset(
"asset-recent",
"recent.jpg",
asset_date=datetime.now(timezone.utc) - timedelta(hours=1),
added_date=None,
)
old_asset = DummyAsset(
"asset-old",
"old.jpg",
asset_date=datetime.now(timezone.utc) - timedelta(days=10),
added_date=None,
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Make this test actually remove added_date.

DummyAsset replaces added_date=None with asset_date, so this test currently exercises filtering by added_date, not the intended fallback to asset_date.

Proposed fix
     old_asset = DummyAsset(
         "asset-old",
         "old.jpg",
         asset_date=datetime.now(timezone.utc) - timedelta(days=10),
         added_date=None,
     )
+    recent_asset.added_date = None
+    old_asset.added_date = None
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
recent_asset = DummyAsset(
"asset-recent",
"recent.jpg",
asset_date=datetime.now(timezone.utc) - timedelta(hours=1),
added_date=None,
)
old_asset = DummyAsset(
"asset-old",
"old.jpg",
asset_date=datetime.now(timezone.utc) - timedelta(days=10),
added_date=None,
)
recent_asset = DummyAsset(
"asset-recent",
"recent.jpg",
asset_date=datetime.now(timezone.utc) - timedelta(hours=1),
added_date=None,
)
old_asset = DummyAsset(
"asset-old",
"old.jpg",
asset_date=datetime.now(timezone.utc) - timedelta(days=10),
added_date=None,
)
recent_asset.added_date = None
old_asset.added_date = None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/services/test_photos_sync.py` around lines 784 - 795, The test
currently passes added_date=None which DummyAsset coerces back to asset_date;
instead, after constructing recent_asset and old_asset (created via DummyAsset),
explicitly delete their added_date attributes (e.g., del recent_asset.added_date
and del old_asset.added_date) so the code under test must fall back to
asset_date; update references to recent_asset and old_asset accordingly to
ensure the filter exercises the fallback path.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
pyicloud/services/photos_cloudkit/client.py (1)

191-193: ⚠️ Potential issue | 🟡 Minor

Normalise upload response validation failures.

PhotosUploadResponse.model_validate(...) can still leak a Pydantic ValidationError for malformed successful upload payloads. Keep the Photos client boundary consistently on CloudKitApiError, as batch_count now does.

🐛 Proposed fix
-        data = PhotosUploadResponse.model_validate(
-            self._response_json(response, context="Photos upload")
-        )
+        raw_data = self._response_json(response, context="Photos upload")
+        try:
+            data = PhotosUploadResponse.model_validate(raw_data)
+        except ValidationError as exc:
+            raise CloudKitApiError(
+                "Photos upload response validation failed",
+                payload=raw_data,
+            ) from exc

Run this read-only check to confirm the upload path is wrapped and covered:

#!/bin/bash
# Description: Verify upload response validation handling and related tests.
sed -n '177,199p' pyicloud/services/photos_cloudkit/client.py
rg -nP -C3 'PhotosUploadResponse\.model_validate|except ValidationError|test_upload_file_.*malformed|ValidationError' \
  pyicloud/services/photos_cloudkit/client.py tests/services/test_photos_cloudkit_client.py
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyicloud/services/photos_cloudkit/client.py` around lines 191 - 193, The
PhotosUploadResponse.model_validate call in the upload path can raise a pydantic
ValidationError which currently leaks out; wrap the call inside a try/except
that catches pydantic.errors.ValidationError (or pydantic.ValidationError) and
re-raise a CloudKitApiError with a clear message and include original
error/context (use the same context string used in _response_json like "Photos
upload"); update the upload handler where PhotosUploadResponse,
self._response_json and variable data are used to ensure any validation failures
are converted to CloudKitApiError consistent with how batch_count errors are
normalized.
🧹 Nitpick comments (1)
pyicloud/services/photos_cloudkit/service.py (1)

633-669: Dead if cls is PhotoAlbum branch — both arms are identical.

After cls is potentially reassigned to PhotoAlbumFolder at line 634, the if cls is PhotoAlbum: return cls(...) block at lines 635–652 and the return cls(...) at lines 653–669 pass the same keyword arguments. The conditional therefore has no behavioural effect and is pure duplication — the whole block can collapse to a single return cls(...).

♻️ Proposed simplification
-        if cls is PhotoAlbum:
-            return cls(
-                library=self,
-                name=album_name,
-                record_id=record_name(record),
-                obj_type=obj_type,
-                list_type=list_type,
-                direction=direction,
-                query_filter=query_filter,
-                client=self._client,
-                zone_id=self.zone_id,
-                query_filters=typed_query_filters,
-                parent_id=cast(Optional[str], record_field_value(record, "parentId")),
-                record_change_tag=record_change_tag(record),
-                record_modification_date=record_field_value(
-                    record, "recordModificationDate"
-                ),
-            )
         return cls(
             library=self,
             name=album_name,
             ...
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyicloud/services/photos_cloudkit/service.py` around lines 633 - 669, The two
branches returning cls(...) are identical and the conditional if cls is
PhotoAlbum is dead; remove the duplication by deleting the conditional branch
and collapse to a single return cls(...) after the cls assignment (where cls may
be set to PhotoAlbumFolder when album_type is AlbumTypeEnum.FOLDER), preserving
all existing keyword arguments (library, name, record_id via
record_name(record), obj_type, list_type, direction, query_filter,
client=self._client, zone_id=self.zone_id, query_filters=typed_query_filters,
parent_id=cast(Optional[str], record_field_value(record, "parentId")),
record_change_tag=record_change_tag(record),
record_modification_date=record_field_value(record, "recordModificationDate"))
so behavior is unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pyicloud/common/cloudkit/client.py`:
- Around line 85-87: The debug log is printing full CloudKit URLs (which may
contain dsid query params or signed asset URLs); update the CloudKit client
logging in the post method (def post(...)) and the corresponding get method to
avoid leaking sensitive query parameters by logging only the endpoint path or a
redacted placeholder instead of the full URL (e.g., use the input path or strip
query string/signed tokens before logging); ensure LOGGER.debug calls no longer
reference the full URL built by build_url but a safe representation.

In `@pyicloud/services/photos_cloudkit/materialize.py`:
- Around line 104-107: The code currently calls taken_at.astimezone() before
formatting for EXIF which shifts the asset's wall-clock time; remove the
.astimezone() call and pass taken_at.strftime("%Y:%m:%d %H:%M:%S") directly to
_insert_exif_datetime_segment (i.e., change the timestamp argument in the call
that assigns updated) so the original timezone-aware timestamp returned by
_asset_datetime() is preserved.

In `@pyicloud/services/photos_legacy.py`:
- Around line 1228-1259: The legacy PhotoAlbum._get_len implementation assumes
the batch response is fully populated and directly indexes
response["batch"][0]["records"][0"]["fields"]["itemCount"]["value"], which
raises IndexError/KeyError on empty or error-shaped payloads and aborts album
iteration; update _get_len to mirror the hardened logic used in
photos_cloudkit/service.py::PhotoAlbum._get_len by validating the response
structure (check "batch", non-empty records, "fields" and "itemCount"/"value")
and catching IndexError/KeyError/TypeError/ValueError around the extraction, log
a clear message via the same logger used elsewhere, and return 0 on
missing/error cases so callers (e.g. photos generator using len(self) - 1)
continue gracefully; reference symbols: _get_len, self.service.session.post,
self._get_container_id, self._zone_id.
- Around line 1037-1041: The current PhotoAlbum.fullname implementation can
raise KeyError or recurse infinitely via malformed/cyclic parent links; change
fullname to perform an iterative upward walk using a visited set to detect
cycles and stop if a parent id is missing instead of indexing AlbumContainer
directly (avoid relying on AlbumContainer.__getitem__ to always succeed). Mirror
the hardened approach used in PhotoAlbum.fullname in photos_cloudkit/service.py:
start from self, loop up reading parent ids from self._parent_id and retrieving
parents from self._library.albums only if present, break and mark missing/stop
if a parent lookup fails, and return a joined path built from the collected
names; ensure cycles are detected by tracking seen ids to prevent infinite
loops.

---

Duplicate comments:
In `@pyicloud/services/photos_cloudkit/client.py`:
- Around line 191-193: The PhotosUploadResponse.model_validate call in the
upload path can raise a pydantic ValidationError which currently leaks out; wrap
the call inside a try/except that catches pydantic.errors.ValidationError (or
pydantic.ValidationError) and re-raise a CloudKitApiError with a clear message
and include original error/context (use the same context string used in
_response_json like "Photos upload"); update the upload handler where
PhotosUploadResponse, self._response_json and variable data are used to ensure
any validation failures are converted to CloudKitApiError consistent with how
batch_count errors are normalized.

---

Nitpick comments:
In `@pyicloud/services/photos_cloudkit/service.py`:
- Around line 633-669: The two branches returning cls(...) are identical and the
conditional if cls is PhotoAlbum is dead; remove the duplication by deleting the
conditional branch and collapse to a single return cls(...) after the cls
assignment (where cls may be set to PhotoAlbumFolder when album_type is
AlbumTypeEnum.FOLDER), preserving all existing keyword arguments (library, name,
record_id via record_name(record), obj_type, list_type, direction, query_filter,
client=self._client, zone_id=self.zone_id, query_filters=typed_query_filters,
parent_id=cast(Optional[str], record_field_value(record, "parentId")),
record_change_tag=record_change_tag(record),
record_modification_date=record_field_value(record, "recordModificationDate"))
so behavior is unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5ed583f2-7dbb-491d-aac7-fed55f835f3b

📥 Commits

Reviewing files that changed from the base of the PR and between 708710c and f424a10.

📒 Files selected for processing (14)
  • examples.py
  • pyicloud/cli/commands/photos.py
  • pyicloud/common/cloudkit/client.py
  • pyicloud/services/photos_cloudkit/client.py
  • pyicloud/services/photos_cloudkit/mappers.py
  • pyicloud/services/photos_cloudkit/materialize.py
  • pyicloud/services/photos_cloudkit/service.py
  • pyicloud/services/photos_cloudkit/sync.py
  • pyicloud/services/photos_legacy.py
  • tests/fixtures/photos_browser_mutations/README.md
  • tests/services/test_photos.py
  • tests/services/test_photos_cloudkit_client.py
  • tests/services/test_photos_sync.py
  • tests/test_cmdline.py
✅ Files skipped from review due to trivial changes (2)
  • tests/fixtures/photos_browser_mutations/README.md
  • pyicloud/services/photos_cloudkit/mappers.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • pyicloud/services/photos_cloudkit/sync.py

Comment thread pyicloud/common/cloudkit/client.py Outdated
Comment thread pyicloud/services/photos_cloudkit/materialize.py
Comment thread pyicloud/services/photos_legacy.py Outdated
Comment thread pyicloud/services/photos_legacy.py Outdated
@timlaing
Copy link
Copy Markdown
Owner

timlaing commented May 8, 2026

@MrJarnould - thanks again for your support and updates. Are you able to resolve the merge conflicts and I'll get this merged. Thanks again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants